Project 00: Food Vision Big¶
This notebook is sourced from Daniel Bourke's ZTM: TensorFlow Developer Certificate course on Udemy.
In this notebook, we're going to build a Transfer Learning Model using data from the Food101 dataset.
Our goal is to beat DeepFood, a 2016 paper whose convolutional neural network, trained for 2-3 days, achieved 77.4% top-1 accuracy.
🔑 Note: Top-1 accuracy means "accuracy for the top softmax activation value output by the model"
| Food Vision Big | |
|---|---|
| Dataset Source | TensorFlow datasets |
| Train Data | 75,750 images |
| Mixed Precision | yes |
| Data Loading | Performant tf.data API |
| Target Results | 77.4% top-1 accuracy |
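The top-1 metric from the note above can be sketched with a few hypothetical softmax outputs (made up for illustration, not from our model): top-1 accuracy simply checks whether the single highest activation matches the true label.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples over 4 classes (illustrative only)
softmax_outputs = np.array([[0.1, 0.7, 0.1, 0.1],   # predicts class 1
                            [0.6, 0.2, 0.1, 0.1],   # predicts class 0
                            [0.2, 0.2, 0.5, 0.1]])  # predicts class 2
true_labels = np.array([1, 3, 2])

# Top-1 accuracy: does the highest activation's index match the true label?
top_1_accuracy = np.mean(np.argmax(softmax_outputs, axis=1) == true_labels)
print(top_1_accuracy)  # 2 of 3 predictions correct
```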
Check GPU¶
We're going to be using mixed precision training.
For mixed precision training to work, you need access to a GPU with a compute capability of 7.0+.
# Check GPU
!nvidia-smi -L
GPU 0: NVIDIA GeForce RTX 3060 Ti (UUID: GPU-4db4d962-71a3-5709-c39b-2fda5802e9d6)
# Check tensorflow version and allocate maximum possible memory from GPU
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf
print(tf.__version__)
# gpus = tf.config.list_physical_devices('GPU')
# if gpus:
# # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
# try:
# tf.config.set_logical_device_configuration(
# gpus[0],
# [tf.config.LogicalDeviceConfiguration(memory_limit=7292)])
# logical_gpus = tf.config.list_logical_devices('GPU')
# print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
# except RuntimeError as e:
# # Virtual devices must be set before GPUs have been initialized
# print(e)
2.8.0
Get helper functions¶
We've created a series of helper functions for the small tasks we found ourselves repeating while building models.
# Get helper functions
import os
if not os.path.exists("helper_functions.py"):
!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py
else:
print("[INFO] 'helper_functions.py' already exists, skipping download...")
[INFO] 'helper_functions.py' already exists, skipping download...
# Import series of helper functions for this notebook
from helper_functions import create_tensorboard_callback, plot_loss_curves, compare_historys
Use TensorFlow Datasets to Download Data¶
For many of the most popular datasets in the machine learning world, you can access them through TensorFlow Datasets (TFDS).
# Get TensorFlow Datasets
import tensorflow_datasets as tfds
# List available datasets
datasets_list = tfds.list_builders() # get all available datasets in TFDS
print("food101" in datasets_list) # is the dataset we're after available?
True
It looks like the dataset we're after is available.
To get access to the Food101 dataset from TFDS, we can use the tfds.load() method.
In particular we'll have to pass it a few parameters to let it know what we're after:
- name: the target dataset (e.g. "food101")
- split: what splits of the dataset are we after?
  - The split parameter is quite tricky. See the documentation for more.
- shuffle_files: whether or not to shuffle the files on download, defaults to False
- as_supervised: True to download data samples in tuple format ((data, label)) or False for dictionary format
- with_info: True to download dataset metadata
# Load the data
(train_data, test_data), ds_info = tfds.load(name="food101",
split=["train", "validation"],
shuffle_files=True,
as_supervised=True,
with_info=True)
# Features of Food101 dataset
ds_info.features
FeaturesDict({
'image': Image(shape=(None, None, 3), dtype=tf.uint8),
'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=101),
})
# Get class names
class_names = ds_info.features['label'].names
class_names[:10]
['apple_pie', 'baby_back_ribs', 'baklava', 'beef_carpaccio', 'beef_tartare', 'beet_salad', 'beignets', 'bibimbap', 'bread_pudding', 'breakfast_burrito']
Exploring the Food101 data from TensorFlow Datasets¶
Now we've downloaded the Food101 dataset from TensorFlow Datasets, how about we do what any good data explorer should?
In other words, "visualize, visualize, visualize".
Let's find out a few details about our dataset:
- The shape of our input data (image tensors)
- The datatype of our input data
- What the labels of our input data look like (e.g. one-hot encoded versus label-encoded)
- Do the labels match up with the class names?
To do so, let's take one sample off the training data (using the .take() method) and explore it.
# Take one sample off the training data
train_one_sample = train_data.take(1)
train_one_sample
<TakeDataset element_spec=(TensorSpec(shape=(None, None, 3), dtype=tf.uint8, name=None), TensorSpec(shape=(), dtype=tf.int64, name=None))>
# Loop through our training sample and output info about it
for image, label in train_one_sample:
print(f"""
Image shape: {image.shape}
Image dtype: {image.dtype}
Target class from Food101 (tensor format): {label}
Class name (str form): {class_names[label.numpy()]}
""")
Image shape: (512, 289, 3)
Image dtype: <dtype: 'uint8'>
Target class from Food101 (tensor format): 45
Class name (str form): frozen_yogurt
# What does an image tensor from tfds's food101 look like?
image
<tf.Tensor: shape=(512, 289, 3), dtype=uint8, numpy=
array([[[145, 151, 125],
[157, 163, 137],
[159, 165, 139],
...,
[197, 209, 199],
[197, 209, 199],
[197, 207, 198]],
[[146, 152, 126],
[156, 162, 136],
[158, 164, 138],
...,
[197, 209, 199],
[197, 209, 199],
[199, 209, 200]],
[[150, 156, 130],
[155, 161, 135],
[157, 163, 137],
...,
[197, 209, 199],
[197, 209, 199],
[199, 209, 198]],
...,
[[173, 186, 177],
[174, 187, 178],
[176, 189, 180],
...,
[181, 197, 187],
[181, 197, 187],
[178, 195, 187]],
[[176, 187, 179],
[176, 187, 179],
[176, 189, 180],
...,
[182, 198, 188],
[182, 198, 188],
[180, 197, 189]],
[[178, 189, 181],
[177, 188, 180],
[175, 188, 179],
...,
[179, 195, 185],
[179, 195, 185],
[179, 196, 190]]], dtype=uint8)>
# What are the min and max values?
tf.reduce_max(image), tf.reduce_min(image)
(<tf.Tensor: shape=(), dtype=uint8, numpy=255>, <tf.Tensor: shape=(), dtype=uint8, numpy=0>)
Plot images from TensorFlow Datasets Food101¶
# Plot multiple image samples using matplotlib and set the title to target class name
# Function to visualize some random images from the dataset
import matplotlib.pyplot as plt
# Get factors.py
import os
if not os.path.exists("factors.py"):
!wget https://raw.githubusercontent.com/ARGF0RCE/Food-Vision/main/factors.py
else:
print("[INFO] 'factors.py' already exists, skipping download...")
from factors import *
def plot_random_images(dataset=train_data, figsize=(20, 20), no_of_samples=10):
    """Visualize a set of random images sampled from a dataset.

    Args:
        dataset: A TFDS dataset of (image, label) pairs. Defaults to train_data.
        figsize (tuple, optional): Desired figsize. Defaults to (20, 20).
        no_of_samples (int, optional): No. of samples to visualize. Defaults to 10.
    """
    image_tensors = []
    labels = []
    plt.figure(figsize=figsize)
    samples = dataset.take(no_of_samples)
    for image, label in samples:
        image_tensors.append(image)
        labels.append(class_names[label.numpy()])
    # Arrange samples in a grid using the median factor pair of no_of_samples
    rows, cols = median(factorization(no_of_samples))
    for i, image_tensor in enumerate(image_tensors):
        plt.subplot(rows, cols, i + 1)
        plt.imshow(image_tensor)
        plt.title(labels[i])
        plt.axis(False);
plot_random_images(no_of_samples=20)
[INFO] 'factors.py' already exists, skipping download...
Now let's preprocess the data and get it ready for use with a neural network.
Create Preprocessing functions for our data¶
Since we've downloaded the data from TensorFlow Datasets, there are a couple of preprocessing steps we have to take before it's ready to model.
More specifically, our data is currently:
- In uint8 datatype
- Comprised of different-sized tensors
- Not scaled (the pixel values are between 0 & 255)
To take care of the preprocessing, we'll create a preprocess_img() function which:
- Resizes an input image tensor to a specified size using tf.image.resize()
- Converts an input image tensor's current datatype to tf.float32 using tf.cast()
🔑 Note: Pretrained EfficientNetBX models in tf.keras.applications.efficientnet have rescaling built-in. But for many other model architectures you'll want to rescale your data.
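For architectures without built-in rescaling, the rescaling step can be added as a layer in the model itself; a minimal sketch (the white dummy image is hypothetical, just to show the value range shrinking):

```python
import tensorflow as tf

# A Rescaling layer maps uint8-range pixel values [0, 255] to [0, 1]
rescale = tf.keras.layers.Rescaling(1./255)

fake_image = tf.ones(shape=(1, 4, 4, 3)) * 255.0  # dummy batch of white pixels
rescaled = rescale(fake_image)
print(float(tf.reduce_max(rescaled)))  # 1.0
```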
# Make a function for preprocessing images
def preprocess_img(image, label, img_shape=224):
"""
Converts image datatype from `uint8` -> `float32` and reshape image to
`[img_shape, img_shape, color_channels]`
"""
image = tf.image.resize(image, [img_shape, img_shape]) # reshape to img_shape
return tf.cast(image, tf.float32), label # return (float32_image, label) tuple
# Preprocess a single sample image and check the outputs
preprocessed_img = preprocess_img(image, label)[0]
print(f"Image before preprocessing:\n {image[:2]}...,\nShape: {image.shape},\nDatatype: {image.dtype}\n")
print(f"Image after preprocessing:\n {preprocessed_img[:2]}...,\nShape: {preprocessed_img.shape},\nDatatype: {preprocessed_img.dtype}")
Image before preprocessing: [[[145 151 125] [157 163 137] [159 165 139] ... [197 209 199] [197 209 199] [197 207 198]] [[146 152 126] [156 162 136] [158 164 138] ... [197 209 199] [197 209 199] [199 209 200]]]..., Shape: (512, 289, 3), Datatype: <dtype: 'uint8'> Image after preprocessing: [[[147.19739 153.19739 127.19739] [157.22768 163.22768 137.22768] [158.09805 164.09805 138.09805] ... [197. 209. 197.54907] [197. 209. 199. ] [198.09914 208.38936 199.24425]] [[154.69467 160.69467 134.69467] [155.87053 161.87053 135.87053] [158.39908 164.39908 138.39908] ... [197. 209. 197.54907] [197. 209. 199. ] [196.32831 206.61853 195.76364]]]..., Shape: (224, 224, 3), Datatype: <dtype: 'float32'>
# We can still plot our preprocessed image as long as we
# divide by 255 (for matplotlib compatibility)
plt.imshow(preprocessed_img/255.)
plt.title(class_names[label])
plt.axis(False);
Batch & prepare datasets¶
Before we can model our data, we have to turn it into batches.
To do this in an effective way, we're going to be leveraging a number of methods from the tf.data API.
📖 Resource: For loading data in the most performant way possible, see the TensorFlow documentation on Better performance with the tf.data API.
Specifically we're going to be using:
- map()
- shuffle()
- batch()
- prefetch()
- (Extra) cache()
Things to note:
- Can't batch tensors of different shapes (e.g. different image sizes), so we need to reshape images first, hence our preprocess_img() function
- shuffle() keeps a buffer of as many shuffled images as the number you pass it; ideally this number would be all of the samples in your training set, but if your training set is large, this buffer might not fit in memory (a fairly large number like 1000 or 10000 usually suffices for shuffling)
- For methods with the num_parallel_calls parameter available (such as map()), setting it to num_parallel_calls=tf.data.AUTOTUNE will parallelize preprocessing and significantly improve speed
- Can't use cache() unless your dataset can fit in memory
Source: Page 422 of Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow Book by Aurélien Géron
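Food101 won't fit in memory here, so we skip cache(), but for a smaller dataset the sketch below (using a tiny dummy dataset, not Food101) shows where cache() would typically slot in: after the expensive map() and before shuffle(), so preprocessed samples are reused across epochs.

```python
import tensorflow as tf

# Tiny dummy dataset standing in for (image, label) pairs
dummy_data = tf.data.Dataset.from_tensor_slices(
    (tf.random.uniform((8, 2, 2, 3)), tf.range(8, dtype=tf.int64)))

small_pipeline = (dummy_data
                  .map(lambda img, lbl: (tf.cast(img, tf.float32), lbl),
                       num_parallel_calls=tf.data.AUTOTUNE)
                  .cache()                      # keep preprocessed samples in memory
                  .shuffle(buffer_size=8)
                  .batch(batch_size=4)
                  .prefetch(tf.data.AUTOTUNE))

for images, labels in small_pipeline.take(1):
    print(images.shape)  # (4, 2, 2, 3)
```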
# Map preprocessing function to training data and parallelize
train_data = train_data.map(map_func=preprocess_img, num_parallel_calls=tf.data.AUTOTUNE)
# Shuffle train data, turn it into batches and prefetch it
train_data = train_data.shuffle(buffer_size=1000).batch(batch_size=32).prefetch(buffer_size=tf.data.AUTOTUNE)
# Map preprocessing function to test data and parallelize
test_data = test_data.map(map_func=preprocess_img, num_parallel_calls=tf.data.AUTOTUNE)
# Turn test data into batches and prefetch it (no need to shuffle test data)
test_data = test_data.batch(batch_size=32).prefetch(buffer_size=tf.data.AUTOTUNE)
train_data, test_data
(<PrefetchDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int64, name=None))>, <PrefetchDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int64, name=None))>)
Create modelling callbacks¶
Since we're going to be training on a large amount of data and training could take a long time, it's a good idea to set up some modelling callbacks to make sure our model's training logs are tracked and our model is checkpointed at various training milestones.
We'll be using the following callbacks:
tf.keras.callbacks.TensorBoard()tf.keras.callbacks.ModelCheckpoint()
# Create TensorBoard callback
from helper_functions import create_tensorboard_callback
# Create ModelCheckpoint callback
checkpoint_path = "food_vision_model_checkpoints/ckpt"
model_chekpoint = tf.keras.callbacks.ModelCheckpoint(checkpoint_path,
monitor="val_accuracy",
save_best_only=False,
save_weights_only=True,
verbose=1)
Setup mixed precision training¶
Using mixed precision training can improve your performance on modern GPUs (those with a compute capability score of 7.0+) by up to 3x.
For a more detailed explanation, I encourage you to read through the TensorFlow mixed precision guide (I'd highly recommend at least checking out the summary).
# Turn on mixed precision training
from tensorflow.keras import mixed_precision
mixed_precision.set_global_policy(policy="mixed_float16")
INFO:tensorflow:Mixed precision compatibility check (mixed_float16): OK Your GPU will likely run quickly with dtype policy mixed_float16 as it has compute capability of at least 7.0. Your GPU: NVIDIA GeForce RTX 3060 Ti, compute capability 8.6
Build feature extraction model¶
To build the feature extraction model, we'll:
- Use EfficientNetB0 from tf.keras.applications pre-trained on ImageNet as our base model
  - We'll download this model without the top layers using the include_top=False parameter so we can create our own output layers
- Freeze the base model layers so we can use the pre-trained patterns the base model found while training on ImageNet
- Put together the input, base model, pooling and output layers in a functional model
- Compile the functional model using the Adam optimizer and sparse categorical crossentropy as the loss function
- Fit the model for 3 epochs using the TensorBoard and ModelCheckpoint callbacks
from tensorflow.keras import layers
# Create base model
input_shape = (224, 224, 3)
base_model = tf.keras.applications.EfficientNetB0(include_top=False)
base_model.trainable = False
# Create functional model
inputs = layers.Input(shape=input_shape, name="input_layer")
x = base_model(inputs, training=False)
x = layers.GlobalAveragePooling2D(name="pooling_layer")(x)
x = layers.Dense(len(class_names))(x)
outputs = layers.Activation("softmax", dtype=tf.float32, name="softmax_float32")(x)
model = tf.keras.Model(inputs, outputs, name="feature_extract_model")
# Compile the model
model.compile(loss="sparse_categorical_crossentropy",
optimizer=tf.keras.optimizers.Adam(),
metrics=["accuracy"])
# Check model summary
model.summary()
Model: "feature_extract_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_layer (InputLayer) [(None, 224, 224, 3)] 0
efficientnetb0 (Functional) (None, None, None, 1280) 4049571
pooling_layer (GlobalAverag (None, 1280) 0
ePooling2D)
dense (Dense) (None, 101) 129381
softmax_float32 (Activation (None, 101) 0
)
=================================================================
Total params: 4,178,952
Trainable params: 129,381
Non-trainable params: 4,049,571
_________________________________________________________________
Checking layer dtype policies¶
We can check whether the model is using mixed_precision policy by iterating through our model's layers and printing layer attributes such as dtype and dtype_policy.
for layer in model.layers:
print(layer.name, layer.trainable, layer.dtype, layer.dtype_policy)
input_layer True float32 <Policy "float32"> efficientnetb0 False float32 <Policy "mixed_float16"> pooling_layer True float32 <Policy "mixed_float16"> dense True float32 <Policy "mixed_float16"> softmax_float32 True float32 <Policy "float32">
# Check the layers in the base model
for layer in model.layers[1].layers[:20]:
print(layer.name, layer.trainable, layer.dtype, layer.dtype_policy)
input_1 False float32 <Policy "float32"> rescaling False float32 <Policy "mixed_float16"> normalization False float32 <Policy "mixed_float16"> stem_conv_pad False float32 <Policy "mixed_float16"> stem_conv False float32 <Policy "mixed_float16"> stem_bn False float32 <Policy "mixed_float16"> stem_activation False float32 <Policy "mixed_float16"> block1a_dwconv False float32 <Policy "mixed_float16"> block1a_bn False float32 <Policy "mixed_float16"> block1a_activation False float32 <Policy "mixed_float16"> block1a_se_squeeze False float32 <Policy "mixed_float16"> block1a_se_reshape False float32 <Policy "mixed_float16"> block1a_se_reduce False float32 <Policy "mixed_float16"> block1a_se_expand False float32 <Policy "mixed_float16"> block1a_se_excite False float32 <Policy "mixed_float16"> block1a_project_conv False float32 <Policy "mixed_float16"> block1a_project_bn False float32 <Policy "mixed_float16"> block2a_expand_conv False float32 <Policy "mixed_float16"> block2a_expand_bn False float32 <Policy "mixed_float16"> block2a_expand_activation False float32 <Policy "mixed_float16">
Fit the feature extraction model¶
To save time per epoch, we'll only validate on 15% of the test data.
# Turn off all warnings except for errors
tf.get_logger().setLevel('ERROR')
# Fit the model with callbacks
history_food_vision_big_feature_extract_model = model.fit(train_data,
epochs=3,
steps_per_epoch=len(train_data),
validation_data=test_data,
validation_steps=int(0.15 * len(test_data)),
callbacks=[create_tensorboard_callback("training_logs_food_vision",
"efficientnetb0_all_data_feature_extract"),
model_chekpoint])
Saving TensorBoard log files to: training_logs_food_vision/efficientnetb0_all_data_feature_extract/20221216-011240 Epoch 1/3 2367/2368 [============================>.] - ETA: 0s - loss: 1.8257 - accuracy: 0.5568 Epoch 1: saving model to food_vision_model_checkpoints/ckpt 2368/2368 [==============================] - 68s 27ms/step - loss: 1.8256 - accuracy: 0.5568 - val_loss: 1.2273 - val_accuracy: 0.6719 Epoch 2/3 2367/2368 [============================>.] - ETA: 0s - loss: 1.2930 - accuracy: 0.6667 Epoch 2: saving model to food_vision_model_checkpoints/ckpt 2368/2368 [==============================] - 63s 27ms/step - loss: 1.2931 - accuracy: 0.6667 - val_loss: 1.1280 - val_accuracy: 0.6962 Epoch 3/3 2368/2368 [==============================] - ETA: 0s - loss: 1.1426 - accuracy: 0.7014 Epoch 3: saving model to food_vision_model_checkpoints/ckpt 2368/2368 [==============================] - 63s 27ms/step - loss: 1.1426 - accuracy: 0.7014 - val_loss: 1.0858 - val_accuracy: 0.7068
# Evaluate model on whole test dataset
results_feature_extract_model = model.evaluate(test_data)
results_feature_extract_model
790/790 [==============================] - 17s 22ms/step - loss: 1.0897 - accuracy: 0.7090
[1.0897213220596313, 0.7090296745300293]
Load and evaluate checkpoint weights¶
We can load and evaluate the model's checkpoints by:
- Cloning our model using tf.keras.models.clone_model() to make a copy of our feature extraction model with reset weights
- Calling the load_weights() method on our cloned model, passing it the path where our checkpointed weights are stored
- Calling evaluate() on the cloned model with loaded weights
# Clone the model we created
cloned_model = tf.keras.models.clone_model(model)
cloned_model.summary()
Model: "feature_extract_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_layer (InputLayer) [(None, 224, 224, 3)] 0
efficientnetb0 (Functional) (None, None, None, 1280) 4049571
pooling_layer (GlobalAverag (None, 1280) 0
ePooling2D)
dense (Dense) (None, 101) 129381
softmax_float32 (Activation (None, 101) 0
)
=================================================================
Total params: 4,178,952
Trainable params: 129,381
Non-trainable params: 4,049,571
_________________________________________________________________
# Load checkpointed weights into cloned model
cloned_model.load_weights(checkpoint_path)
<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f3790367dc0>
# Compile cloned_model
cloned_model.compile(loss="sparse_categorical_crossentropy",
optimizer=tf.keras.optimizers.Adam(),
metrics=["accuracy"])
results_cloned_model = cloned_model.evaluate(test_data)
790/790 [==============================] - 19s 22ms/step - loss: 1.7244 - accuracy: 0.5512
# Loaded checkpoint weights should return very similar results to checkpoint weights prior to saving
import numpy as np
np.isclose(results_feature_extract_model, results_cloned_model).all() # check if all elements in array are close
False
Save the whole model to file¶
We can also save the whole model using the save() method.
model.save('food_vision_model_feature_extract.h5', overwrite=True)
# Load model
saved_model = tf.keras.models.load_model('food_vision_model_feature_extract.h5')
# Check the layers in the base model and see what dtype policy they're using
for layer in saved_model.layers[1].layers[:20]:
print(layer.name, layer.trainable, layer.dtype, layer.dtype_policy)
input_1 True float32 <Policy "float32"> rescaling False float32 <Policy "mixed_float16"> normalization False float32 <Policy "mixed_float16"> stem_conv_pad False float32 <Policy "mixed_float16"> stem_conv False float32 <Policy "mixed_float16"> stem_bn False float32 <Policy "mixed_float16"> stem_activation False float32 <Policy "mixed_float16"> block1a_dwconv False float32 <Policy "mixed_float16"> block1a_bn False float32 <Policy "mixed_float16"> block1a_activation False float32 <Policy "mixed_float16"> block1a_se_squeeze False float32 <Policy "mixed_float16"> block1a_se_reshape False float32 <Policy "mixed_float16"> block1a_se_reduce False float32 <Policy "mixed_float16"> block1a_se_expand False float32 <Policy "mixed_float16"> block1a_se_excite False float32 <Policy "mixed_float16"> block1a_project_conv False float32 <Policy "mixed_float16"> block1a_project_bn False float32 <Policy "mixed_float16"> block2a_expand_conv False float32 <Policy "mixed_float16"> block2a_expand_bn False float32 <Policy "mixed_float16"> block2a_expand_activation False float32 <Policy "mixed_float16">
# Check loaded model performance
saved_model_results = saved_model.evaluate(test_data)
saved_model_results
790/790 [==============================] - 18s 22ms/step - loss: 1.0897 - accuracy: 0.7090
[1.0897213220596313, 0.7090296745300293]
# The loaded model's results should be very close to the model's results prior to saving
np.isclose(results_feature_extract_model, saved_model_results).all()
True
Preparing our model's layers for fine tuning¶
# Set all layers' .trainable attribute in the loaded model to True (so they're unfrozen)
saved_model.layers[1].trainable = True
# for layer in base_model.layers[:-20]:
# layer.trainable = False
# print(layer.name, layer.trainable, layer.dtype, layer.dtype_policy)
saved_model.compile(loss=tf.keras.losses.sparse_categorical_crossentropy,
optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
metrics=["accuracy"])
# Check which layers are tuneable (trainable)
for layer_number, layer in enumerate(saved_model.layers[1].layers):
print(layer_number, layer.name, layer.trainable)
0 input_1 True 1 rescaling True 2 normalization True 3 stem_conv_pad True 4 stem_conv True 5 stem_bn True 6 stem_activation True 7 block1a_dwconv True 8 block1a_bn True 9 block1a_activation True 10 block1a_se_squeeze True 11 block1a_se_reshape True 12 block1a_se_reduce True 13 block1a_se_expand True 14 block1a_se_excite True 15 block1a_project_conv True 16 block1a_project_bn True 17 block2a_expand_conv True 18 block2a_expand_bn True 19 block2a_expand_activation True 20 block2a_dwconv_pad True 21 block2a_dwconv True 22 block2a_bn True 23 block2a_activation True 24 block2a_se_squeeze True 25 block2a_se_reshape True 26 block2a_se_reduce True 27 block2a_se_expand True 28 block2a_se_excite True 29 block2a_project_conv True 30 block2a_project_bn True 31 block2b_expand_conv True 32 block2b_expand_bn True 33 block2b_expand_activation True 34 block2b_dwconv True 35 block2b_bn True 36 block2b_activation True 37 block2b_se_squeeze True 38 block2b_se_reshape True 39 block2b_se_reduce True 40 block2b_se_expand True 41 block2b_se_excite True 42 block2b_project_conv True 43 block2b_project_bn True 44 block2b_drop True 45 block2b_add True 46 block3a_expand_conv True 47 block3a_expand_bn True 48 block3a_expand_activation True 49 block3a_dwconv_pad True 50 block3a_dwconv True 51 block3a_bn True 52 block3a_activation True 53 block3a_se_squeeze True 54 block3a_se_reshape True 55 block3a_se_reduce True 56 block3a_se_expand True 57 block3a_se_excite True 58 block3a_project_conv True 59 block3a_project_bn True 60 block3b_expand_conv True 61 block3b_expand_bn True 62 block3b_expand_activation True 63 block3b_dwconv True 64 block3b_bn True 65 block3b_activation True 66 block3b_se_squeeze True 67 block3b_se_reshape True 68 block3b_se_reduce True 69 block3b_se_expand True 70 block3b_se_excite True 71 block3b_project_conv True 72 block3b_project_bn True 73 block3b_drop True 74 block3b_add True 75 block4a_expand_conv True 76 block4a_expand_bn True 77 block4a_expand_activation True 78 
block4a_dwconv_pad True 79 block4a_dwconv True 80 block4a_bn True 81 block4a_activation True 82 block4a_se_squeeze True 83 block4a_se_reshape True 84 block4a_se_reduce True 85 block4a_se_expand True 86 block4a_se_excite True 87 block4a_project_conv True 88 block4a_project_bn True 89 block4b_expand_conv True 90 block4b_expand_bn True 91 block4b_expand_activation True 92 block4b_dwconv True 93 block4b_bn True 94 block4b_activation True 95 block4b_se_squeeze True 96 block4b_se_reshape True 97 block4b_se_reduce True 98 block4b_se_expand True 99 block4b_se_excite True 100 block4b_project_conv True 101 block4b_project_bn True 102 block4b_drop True 103 block4b_add True 104 block4c_expand_conv True 105 block4c_expand_bn True 106 block4c_expand_activation True 107 block4c_dwconv True 108 block4c_bn True 109 block4c_activation True 110 block4c_se_squeeze True 111 block4c_se_reshape True 112 block4c_se_reduce True 113 block4c_se_expand True 114 block4c_se_excite True 115 block4c_project_conv True 116 block4c_project_bn True 117 block4c_drop True 118 block4c_add True 119 block5a_expand_conv True 120 block5a_expand_bn True 121 block5a_expand_activation True 122 block5a_dwconv True 123 block5a_bn True 124 block5a_activation True 125 block5a_se_squeeze True 126 block5a_se_reshape True 127 block5a_se_reduce True 128 block5a_se_expand True 129 block5a_se_excite True 130 block5a_project_conv True 131 block5a_project_bn True 132 block5b_expand_conv True 133 block5b_expand_bn True 134 block5b_expand_activation True 135 block5b_dwconv True 136 block5b_bn True 137 block5b_activation True 138 block5b_se_squeeze True 139 block5b_se_reshape True 140 block5b_se_reduce True 141 block5b_se_expand True 142 block5b_se_excite True 143 block5b_project_conv True 144 block5b_project_bn True 145 block5b_drop True 146 block5b_add True 147 block5c_expand_conv True 148 block5c_expand_bn True 149 block5c_expand_activation True 150 block5c_dwconv True 151 block5c_bn True 152 block5c_activation True 153 
block5c_se_squeeze True 154 block5c_se_reshape True 155 block5c_se_reduce True 156 block5c_se_expand True 157 block5c_se_excite True 158 block5c_project_conv True 159 block5c_project_bn True 160 block5c_drop True 161 block5c_add True 162 block6a_expand_conv True 163 block6a_expand_bn True 164 block6a_expand_activation True 165 block6a_dwconv_pad True 166 block6a_dwconv True 167 block6a_bn True 168 block6a_activation True 169 block6a_se_squeeze True 170 block6a_se_reshape True 171 block6a_se_reduce True 172 block6a_se_expand True 173 block6a_se_excite True 174 block6a_project_conv True 175 block6a_project_bn True 176 block6b_expand_conv True 177 block6b_expand_bn True 178 block6b_expand_activation True 179 block6b_dwconv True 180 block6b_bn True 181 block6b_activation True 182 block6b_se_squeeze True 183 block6b_se_reshape True 184 block6b_se_reduce True 185 block6b_se_expand True 186 block6b_se_excite True 187 block6b_project_conv True 188 block6b_project_bn True 189 block6b_drop True 190 block6b_add True 191 block6c_expand_conv True 192 block6c_expand_bn True 193 block6c_expand_activation True 194 block6c_dwconv True 195 block6c_bn True 196 block6c_activation True 197 block6c_se_squeeze True 198 block6c_se_reshape True 199 block6c_se_reduce True 200 block6c_se_expand True 201 block6c_se_excite True 202 block6c_project_conv True 203 block6c_project_bn True 204 block6c_drop True 205 block6c_add True 206 block6d_expand_conv True 207 block6d_expand_bn True 208 block6d_expand_activation True 209 block6d_dwconv True 210 block6d_bn True 211 block6d_activation True 212 block6d_se_squeeze True 213 block6d_se_reshape True 214 block6d_se_reduce True 215 block6d_se_expand True 216 block6d_se_excite True 217 block6d_project_conv True 218 block6d_project_bn True 219 block6d_drop True 220 block6d_add True 221 block7a_expand_conv True 222 block7a_expand_bn True 223 block7a_expand_activation True 224 block7a_dwconv True 225 block7a_bn True 226 block7a_activation True 227 
block7a_se_squeeze True 228 block7a_se_reshape True 229 block7a_se_reduce True 230 block7a_se_expand True 231 block7a_se_excite True 232 block7a_project_conv True 233 block7a_project_bn True 234 top_conv True 235 top_bn True 236 top_activation True
# Setup EarlyStopping callback
earlystopping_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)
# Setup ModelCheckpoint callback
checkpoint_path = "food_vision_model_checkpoints/fine_tune_ckpt"
model_chekpoint = tf.keras.callbacks.ModelCheckpoint(checkpoint_path,
monitor="val_accuracy",
save_best_only=True,
verbose=1)
# Start to fine tune the model
history_fine_tuned_model = saved_model.fit(train_data, epochs=100,
steps_per_epoch=len(train_data),
validation_data=test_data,
initial_epoch=history_food_vision_big_feature_extract_model.epoch[-1],
validation_steps=int(0.15 * len(test_data)),
callbacks=[create_tensorboard_callback("training_logs_food_vision",
"efficientnetb0_all_data_fine_tune"),
model_chekpoint, earlystopping_callback])
Saving TensorBoard log files to: training_logs_food_vision/efficientnetb0_all_data_fine_tune/20221216-011652 Epoch 3/100 2368/2368 [==============================] - ETA: 0s - loss: 0.9253 - accuracy: 0.7519 Epoch 3: val_accuracy improved from -inf to 0.77463, saving model to food_vision_model_checkpoints/fine_tune_ckpt 2368/2368 [==============================] - 210s 86ms/step - loss: 0.9253 - accuracy: 0.7519 - val_loss: 0.7822 - val_accuracy: 0.7746 Epoch 4/100 2367/2368 [============================>.] - ETA: 0s - loss: 0.5811 - accuracy: 0.8405 Epoch 4: val_accuracy improved from 0.77463 to 0.78469, saving model to food_vision_model_checkpoints/fine_tune_ckpt 2368/2368 [==============================] - 203s 85ms/step - loss: 0.5811 - accuracy: 0.8405 - val_loss: 0.7919 - val_accuracy: 0.7847 Epoch 5/100 2367/2368 [============================>.] - ETA: 0s - loss: 0.3319 - accuracy: 0.9051 Epoch 5: val_accuracy did not improve from 0.78469 2368/2368 [==============================] - 185s 78ms/step - loss: 0.3319 - accuracy: 0.9051 - val_loss: 0.8681 - val_accuracy: 0.7709 Epoch 6/100 2367/2368 [============================>.] - ETA: 0s - loss: 0.1771 - accuracy: 0.9470 Epoch 6: val_accuracy improved from 0.78469 to 0.78522, saving model to food_vision_model_checkpoints/fine_tune_ckpt 2368/2368 [==============================] - 203s 86ms/step - loss: 0.1771 - accuracy: 0.9470 - val_loss: 0.9279 - val_accuracy: 0.7852
fine_tuned_model_results = saved_model.evaluate(test_data)
fine_tuned_model_results
790/790 [==============================] - 17s 22ms/step - loss: 0.9488 - accuracy: 0.7797
[0.9487667679786682, 0.7797227501869202]
saved_model.save('food_vision_model_fine_tuned.h5', overwrite=True)
# Load fine_tuned model
!wget -O food_vision_model_fine_tuned.h5 https://storage.googleapis.com/tf_dev_exam_prep_resources/saved_models/food_vision_model_fine_tuned.h5
saved_fine_tuned_model = tf.keras.models.load_model('food_vision_model_fine_tuned.h5')
--2022-12-16 01:37:40-- https://storage.googleapis.com/tf_dev_exam_prep_resources/saved_models/food_vision_model_fine_tuned.h5 Resolving storage.googleapis.com (storage.googleapis.com)... 142.250.182.80, 142.250.182.112, 142.250.182.144, ... Connecting to storage.googleapis.com (storage.googleapis.com)|142.250.182.80|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 50754364 (48M) [application/x-hdf] Saving to: ‘food_vision_model_fine_tuned.h5’ food_vision_model_f 100%[===================>] 48.40M 11.2MB/s in 4.4s 2022-12-16 01:37:44 (11.0 MB/s) - ‘food_vision_model_fine_tuned.h5’ saved [50754364/50754364]
# Check evaluation
saved_fine_tuned_model_results = saved_fine_tuned_model.evaluate(test_data)
saved_fine_tuned_model_results
790/790 [==============================] - 18s 22ms/step - loss: 0.9488 - accuracy: 0.7797
[0.9487666487693787, 0.7797227501869202]
np.isclose(saved_fine_tuned_model_results, fine_tuned_model_results).all()
True
The End....